Evaluation of de novo transcriptome assemblies from RNA-Seq

Authors

  • Bo Li
  • Nathanael Fillmore
  • Yongsheng Bai
  • Mike Collins
  • James A. Thomson
  • Ron Stewart
  • Colin N. Dewey
Abstract

For simplicity, we assume that RNA-Seq reads are sequenced uniformly across the transcriptome. In addition, we consider only fixed-length single-end RNA-Seq reads. We denote the read length as L. Given a set of M_T transcripts, we denote the relative expression levels as τ = (τ_1, τ_2, ..., τ_{M_T}), the read-generating probabilities as Θ = (θ_0, θ_1, ..., θ_{M_T}), and the expected read coverage as Ξ = (ξ_0, ξ_1, ξ_2, ..., ξ_{M_T}). We further denote the transcript lengths as L = (l_0, l_1, ..., l_{M_T}) and let l_0 = L, where the L on the right-hand side is the scalar read length. Here transcript 0 refers to a nonexistent "noise" transcript that represents all reads generated from background noise.
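The excerpt above defines the notation but does not state how the quantities relate to one another. As a rough illustration only, the sketch below assumes one common convention for uniform single-end sampling: a transcript of length l_i offers l_i - L + 1 read start positions, θ_i is proportional to τ_i times that count, and the expected per-base coverage is N θ_i L / l_i for N reads. These formulas, the function names, and the `theta0` and `n_reads` parameters are assumptions made for this example and are not taken from the paper.

```python
def read_probabilities(tau, lengths, read_len, theta0=0.0):
    """Return [theta_0, theta_1, ..., theta_M] under the assumed model.

    tau      -- relative expression levels for transcripts 1..M
    lengths  -- transcript lengths l_1..l_M
    read_len -- fixed read length L
    theta0   -- assumed probability that a read comes from the noise transcript
    """
    if len(tau) != len(lengths):
        raise ValueError("tau and lengths must have the same number of entries")
    if any(l < read_len for l in lengths):
        raise ValueError("every transcript must be at least one read long")
    # Number of valid read start positions on each transcript.
    starts = [l - read_len + 1 for l in lengths]
    weights = [t * s for t, s in zip(tau, starts)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one transcript must have positive expression")
    # Share the non-noise probability mass in proportion to the weights.
    return [theta0] + [(1.0 - theta0) * w / total for w in weights]


def expected_coverage(theta, lengths, read_len, n_reads):
    """Return [xi_0, xi_1, ..., xi_M], the assumed mean per-base coverage."""
    # The noise transcript has length l_0 = L, as in the text.
    all_lengths = [read_len] + list(lengths)
    return [n_reads * th * read_len / l for th, l in zip(theta, all_lengths)]


if __name__ == "__main__":
    theta = read_probabilities([0.5, 0.5], [100, 200], read_len=50, theta0=0.1)
    xi = expected_coverage(theta, [100, 200], read_len=50, n_reads=1000)
    print(theta)
    print(xi)
```

With equal relative expression, the longer transcript receives the larger read probability in this sketch, because it offers more start positions. Setting l_0 = L gives the noise transcript exactly one start position under the same counting rule.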

Similar resources

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...

A Comparison of Next Generation Sequencing Technologies for Transcriptome Assembly and Utility for RNA-Seq in a Non-Model Bird

De novo assembled transcriptomes, in combination with RNA-Seq, are powerful tools to explore gene sequence and expression level in organisms without reference genomes. Investigators must first choose which high throughput sequencing platforms will provide data most suitable for their experimental goals. In this study, we explore the utility of 454 and Illumina sequences for de novo transcriptom...

A practical guide to build de-novo assemblies for single tissues of non-model organisms: the example of a Neotropical frog

Whole genome sequencing (WGS) is a very valuable resource to understand the evolutionary history of poorly known species. However, in organisms with large genomes, as most amphibians, WGS is still excessively challenging and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms do not usually have a reference genome and the t...

Compacting and correcting Trinity and Oases RNA-Seq de novo assemblies

BACKGROUND: De novo transcriptome assembly of short reads is now a common step in expression analysis of organisms lacking a reference genome sequence. Several software packages are available to perform this task. Even if their results are of good quality, it is still possible to improve them in several ways, including redundancy reduction or error correction. Trinity and Oases are two commonly us...

Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome

Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo ass...

Publication date: 2014